Improving cluster analysis by co-initializations
Authors
Abstract
Many modern clustering methods employ a non-convex objective function and use iterative optimization algorithms to find local minima. The initialization of these algorithms is therefore very important. Conventionally, the starting guess of the iterations is chosen at random; however, such a simple initialization often leads to poor clusterings. Here we propose a new method to improve cluster analysis by combining a set of clustering methods. Unlike other aggregation approaches, which seek consensus partitions, the participating methods in our approach are applied sequentially, providing initializations for each other. We present a hierarchy, from simple to comprehensive, for different levels of such co-initializations. Extensive experimental results on real-world datasets show that a higher level of initialization often leads to better clusterings. In particular, the proposed strategy is more effective for complex clustering objectives, such as our recent cluster analysis method based on low-rank doubly stochastic matrix decomposition (DCD). An empirical comparison with three ensemble clustering methods that seek consensus clusters confirms the superiority of DCD improved by co-initialization.
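As a rough illustration of the co-initialization idea, the sketch below runs a simple clustering method first and feeds its partition to a more complex method as the starting point, instead of a random guess. KMeans and NMF from scikit-learn stand in for the participating methods; the paper's own hierarchy of co-initialization levels and the DCD objective are not reproduced, and the dataset, noise floor, and parameter values are illustrative assumptions.

```python
# Minimal co-initialization sketch: one method's result seeds another,
# replacing the usual random starting guess.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.decomposition import NMF

X = load_iris().data          # non-negative features, 150 x 4
k = 3

# Stage 1: a simple method (KMeans) produces an initial partition.
km = KMeans(n_clusters=k, n_init=10, random_state=0).fit(X)

# Turn the hard labels into a soft, non-negative membership matrix W0
# and a matching H0, which seed the more complex method.
W0 = np.full((X.shape[0], k), 0.01)
W0[np.arange(X.shape[0]), km.labels_] = 1.0
H0 = np.vstack([X[km.labels_ == c].mean(axis=0) for c in range(k)])

# Stage 2: NMF refines the partition starting from the co-initialization.
nmf = NMF(n_components=k, init="custom", max_iter=500, random_state=0)
W = nmf.fit_transform(X, W=W0, H=H0)
labels = W.argmax(axis=1)     # final cluster assignment
```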
Similar resources
Data Clustering Using Evidence Accumulation
We explore the idea of evidence accumulation for combining the results of multiple clusterings. Initially, n d-dimensional data is decomposed into a large number of compact clusters; the K-means algorithm performs this decomposition, with several clusterings obtained by N random initializations of the K-means. Taking the co-occurrences of pairs of patterns in the same cluster as votes for their ...
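A minimal sketch of the evidence-accumulation scheme summarized above, assuming numpy, scipy, and scikit-learn: N K-means runs with random initializations vote pairwise into a co-association matrix, which is then cut by hierarchical clustering. The values of N, k, the final number of clusters, and the average-linkage choice are illustrative assumptions rather than the cited paper's exact settings.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

X = load_iris().data
n, N, k = X.shape[0], 30, 10          # N runs, each producing k compact clusters

# Accumulate evidence: count how often each pair of points shares a cluster.
coassoc = np.zeros((n, n))
for seed in range(N):
    labels = KMeans(n_clusters=k, init="random", n_init=1,
                    random_state=seed).fit_predict(X)
    coassoc += (labels[:, None] == labels[None, :])
coassoc /= N                          # fraction of runs voting for each pair

# Turn co-association into a distance and extract the final partition.
dist = 1.0 - coassoc
np.fill_diagonal(dist, 0.0)
Z = linkage(squareform(dist, checks=False), method="average")
final_labels = fcluster(Z, t=3, criterion="maxclust")
```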
Generalizing and Improving Weight Initialization
We propose a new weight initialization suited for arbitrary nonlinearities by generalizing previous weight initializations. The initialization corrects for the influence of dropout rates and an arbitrary nonlinearity’s influence on variance through simple corrective scalars. Consequently, this initialization does not require computing mini-batch statistics nor weight pre-initialization. This si...
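The sketch below illustrates the kind of corrective scaling described above, assuming the correction can be reduced to a single per-layer scalar estimated by Monte Carlo; the exact formula, function names, and parameters are assumptions for illustration and are not taken from the cited paper.

```python
# Rough sketch: scale a random weight matrix so that activation variance is
# roughly preserved after an arbitrary nonlinearity and dropout with keep
# probability p.  The scaling rule below is an illustrative assumption.
import numpy as np

def corrective_init(fan_in, fan_out, nonlinearity, keep_prob=1.0, rng=None):
    if rng is None:
        rng = np.random.default_rng(0)
    # Estimate E[f(z)^2] for z ~ N(0, 1) by Monte Carlo.
    z = rng.standard_normal(100_000)
    second_moment = np.mean(nonlinearity(z) ** 2)
    # Choose the standard deviation so variance survives the nonlinearity
    # and the thinning caused by dropout.
    std = 1.0 / np.sqrt(fan_in * second_moment * keep_prob)
    return rng.normal(0.0, std, size=(fan_in, fan_out))

# For ReLU this recovers a He-style scale, corrected for a dropout rate of 0.2.
W = corrective_init(256, 128, lambda z: np.maximum(z, 0.0), keep_prob=0.8)
```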
Bias-Correction Fuzzy C-Regressions Algorithm
In fuzzy clustering, the fuzzy c-means (FCM) algorithm is the most commonly used clustering method. However, the FCM algorithm is usually affected by initializations. Incorporating FCM into switching regressions, known as fuzzy c-regressions (FCR), suffers from the same drawback as FCM, where bad initializations may cause difficulties in obtaining appropriate clustering and regression results. In...
On Initializations for the Minkowski Weighted K-Means
Minkowski Weighted K-Means is a variant of K-Means set in the Minkowski space, automatically computing weights for features at each cluster. As a variant of K-Means, its accuracy heavily depends on the initial centroids fed to it. In this paper we discuss our experiments comparing six initializations, random and five others defined in the Minkowski space, in terms of their accuracy, proc...
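Minkowski Weighted K-Means itself is not available in scikit-learn, so the sketch below only mirrors the shape of such an experiment with plain KMeans: two seeding strategies are compared on accuracy (adjusted Rand index against the known labels) and processing time. The dataset, number of restarts, and metrics are illustrative assumptions.

```python
# Compare two K-means initializations on accuracy and runtime.
import time
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.metrics import adjusted_rand_score

X, y = load_iris(return_X_y=True)
for init in ("random", "k-means++"):
    start = time.perf_counter()
    labels = KMeans(n_clusters=3, init=init, n_init=20,
                    random_state=0).fit_predict(X)
    print(f"{init:10s}  ARI={adjusted_rand_score(y, labels):.3f}  "
          f"time={time.perf_counter() - start:.3f}s")
```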
Bias-correction fuzzy clustering algorithms
Keywords: Cluster analysis; Fuzzy clustering; Fuzzy c-means (FCM); Initialization; Bias correction; Probability weight. Fuzzy clustering is generally an extension of hard clustering and it is based on fuzzy membership partitions. In fuzzy clustering, the fuzzy c-means (FCM) algorithm is the most commonly used clustering method. Numerous studies have presented various generalizations o...
Journal: Pattern Recognition Letters
Volume: 45, Issue: -
Pages: -
Publication date: 2014